Mark Dunning - The University of Sheffield
Bioinformatics: an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex.
No matter how much of the analysis is automated, some manual steps are inevitably involved.
| Patient ID | Sex | Date of Diagnosis | Tumour Size |
|---|---|---|---|
| 1 | M | 01-01-2013 | 3.1 |
| 2 | f | 04-18-1998 | 1.5 |
| 3 | Male | 1st of April 2004 | 105 |
| 4 | Female | NA | 67 |
| 5 | F | 2010/03/12 | 4.2 |
| 6 | F | | 3.6 |
| 7 | M | 1994-11-05T08:15:30-05:00 | 232 |
credit: @myusuf3
| Patient ID | Sex | Date of Diagnosis | Tumour Size |
|---|---|---|---|
| 001 | M | 2013-01-01 | 3.1 |
| 002 | F | 1998-04-18 | 1.5 |
| 003 | M | 2004-04-01 | 1.05 |
| 004 | F | NA | 0.67 |
| 005 | F | 2010-03-12 | 4.2 |
| 006 | F | NA | 3.6 |
| 007 | M | 1994-11-05 | 2.32 |
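
The clean-up of the Sex and Date of Diagnosis columns shown above could be scripted rather than done by hand. The following is a minimal sketch using only the Python standard library, not the method used to produce the table. The names `SEX_CODES`, `DATE_FORMATS`, `clean_sex` and `clean_date` are hypothetical, and it assumes that dash-separated dates with the year last are month-day-year (as `04-18-1998` suggests). It leaves Tumour Size alone, because the rule for rescaling those values is not stated.

```python
import re
from datetime import datetime

# Hypothetical lookup for the free-text sex codes seen in the messy table
SEX_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}

# Candidate date layouts, tried in order.
# Assumption: "04-18-1998" style dates are month-day-year.
DATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d", "%m-%d-%Y", "%d of %B %Y"]


def clean_sex(value):
    """Map a free-text sex entry to 'M' or 'F', or 'NA' if unrecognised."""
    return SEX_CODES.get(value.strip().lower(), "NA")


def clean_date(value):
    """Return the date as YYYY-MM-DD, or 'NA' if it cannot be parsed."""
    text = value.strip()
    if text in ("", "NA"):
        return "NA"
    # Drop ordinal suffixes, e.g. "1st" -> "1"
    text = re.sub(r"(?<=\d)(st|nd|rd|th)\b", "", text)
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    try:
        # Full ISO 8601 timestamps: keep only the calendar date
        return datetime.fromisoformat(text).date().isoformat()
    except ValueError:
        return "NA"
```

For example, `clean_date("1st of April 2004")` gives `"2004-04-01"` and `clean_sex("Female")` gives `"F"`. Anything the sketch cannot interpret is returned as `NA` rather than guessed, so it can be checked manually.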
NA is OK, but what if NA is a valid category in your data?
NA is treated as a missing value and can be ignored in calculations.

| Patient ID | Date | Value |
|---|---|---|
| 1 | 2015-06-14 | 213 |
| 2 | | 76.5 |
| 3 | 2015-06-18 | 32 |
| 4 | | 120.3 |
| 5 | | 109 |
| 6 | 2015-06-20 | |
| 7 | | 143 |
Fill in all the cells
| Patient ID | Date | Value |
|---|---|---|
| 1 | 2015-06-14 | 213 |
| 2 | 2015-06-14 | 76.5 |
| 3 | 2015-06-18 | 32 |
| 4 | 2015-06-18 | 120.3 |
| 5 | 2015-06-18 | 109 |
| 6 | 2015-06-20 | NA |
| 7 | 2015-06-20 | 143 |
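
Filling in the cells can also be done programmatically. The sketch below uses plain Python dictionaries and assumes, as the filled table implies, that a blank Date means "same as the row above" while a blank Value is genuinely missing. The function names `fill_down` and `mark_missing` are hypothetical.

```python
def fill_down(rows, column, missing=""):
    """Copy the last seen value of `column` into rows where it is blank."""
    filled = []
    last_seen = None
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row[column] == missing:
            if last_seen is not None:
                row[column] = last_seen
        else:
            last_seen = row[column]
        filled.append(row)
    return filled


def mark_missing(rows, column, missing="", marker="NA"):
    """Replace blank cells in `column` with an explicit missing-value marker."""
    return [
        {**row, column: marker if row[column] == missing else row[column]}
        for row in rows
    ]


# The table with blank cells, one dictionary per row
rows = [
    {"Patient ID": "1", "Date": "2015-06-14", "Value": "213"},
    {"Patient ID": "2", "Date": "", "Value": "76.5"},
    {"Patient ID": "3", "Date": "2015-06-18", "Value": "32"},
    {"Patient ID": "4", "Date": "", "Value": "120.3"},
    {"Patient ID": "5", "Date": "", "Value": "109"},
    {"Patient ID": "6", "Date": "2015-06-20", "Value": ""},
    {"Patient ID": "7", "Date": "", "Value": "143"},
]

complete = mark_missing(fill_down(rows, "Date"), "Value")
```

Carrying a value down is only appropriate when a blank really means "same as above". A measurement that was never taken should be marked `NA`, not copied from a neighbouring row.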
Make it a rectangle
Computer doesn’t recognize it!
patient-data.xlsx and open in Excel, or equivalent software